ABSTRACT

The global architecture of the cell nucleus and 
the spatial organization of chromatin play impor- 
tant roles in gene expression and nuclear function. 
Single-cell imaging and chromosome conformation 
capture-based techniques provide a wealth of infor- 
mation on the spatial organization of chromosomes. 
However, a mechanistic model that can account for 
all observed scaling behaviors governing long-range 
chromatin interactions is missing. Here we describe 
a model called constrained self-avoiding chromatin 
(C-SAC) for studying spatial structures of chromo- 
somes, as the available space is a key determinant of 
chromosome folding. We studied large ensembles of 
model chromatin chains with appropriate fiber diam- 
eter, persistence length and excluded volume under 
spatial confinement. We show that the equilibrium 
ensemble of randomly folded chromosomes in the 
confined nuclear volume gives rise to the experimen- 
tally observed higher-order architecture of human 
chromosomes, including average scaling properties 
of mean-square spatial distance, end-to-end dis- 
tance, contact probability and their chromosome-to- 
chromosome variabilities. Our results indicate that 
the overall structure of a human chromosome is dic- 
tated by the spatial confinement of the nuclear space, 
which may undergo significant tissue- and develop- 
mental stage-specific size changes. 



INTRODUCTION 

Human cells must accommodate ~6 billion base pairs of
deoxyribonucleic acid (DNA) in a small nucleus with a
diameter of 6-20 μm (1). Understanding the spatial
organization of chromatin within the cell nucleus is key to
gaining insights into the mechanisms of gene activity,
nuclear function and the maintenance of cellular epigenetic
states (2). A major task is to understand the rules that
govern the regulation of long-range chromatin interactions
(2-4).

Fluorescence in situ hybridization (FISH), chromosome
conformation capture (3C) and related techniques have
revealed a wealth of information about spatial chromatin
structures across different genomic regions for a variety of
cell types (3,5-10). A key outcome of FISH experiments is
the relationship between the mean-square spatial distance,
$R^2$, and the genomic distance, $s$, of two chromosome loci
(5,6,9). The folded structures of chromatin fibers follow a
scaling relationship of $R^2(s) \sim s^{2\nu}$. In human Chr 1
and 11, the exponent $\nu$ is ~0.33 at smaller genomic
distances (0.4-2 Mbp), but levels off ($\nu \approx 0$) at
larger genomic distances (>10 Mbp) (6). In mouse Chr 12,
$\nu$ is found to be ~0.25 and ~0.37 for two different cell
types at smaller genomic distances (<0.5 Mbp), and levels
off at larger distances (>0.5 Mbp) (9). The leveling-off
effects indicate that each chromosome is confined to a
volume much smaller than the nuclear volume (6,9). This
reflects the requirement that chromosomes must fit into
localized territories (11).

Results from genome-wide 3C (Hi-C) experiments showed that
the contact probability $P_c(s)$ between loci separated by a
genomic distance $s$ follows a power law of
$P_c(s) \sim 1/s^{\alpha}$. The exponent $\alpha$ is ~1.08 at
genomic distances between 0.5 and 7 Mbp, when averaged
across all chromosomes in a human cell line (3). Further
analyses showed that chromosomes 11 and 12 exhibit the
average human genome scaling behavior, with an exponent
$\alpha \approx 1.08$ (3,12-17), while exponents for
chromosomes X and 19 deviate significantly from the average
values, with $\alpha \approx 0.93$ and ~1.30, respectively
(12). Similar results were obtained from a different Hi-C
study (18) (see also (12) for analyses).

In order to gain understanding of the principles of spatial
organization of chromatin, several polymer models have
been developed (3,12-17,19-21). The fractal globule (FG)
model (3,16) offers an explanation of the scaling of
$P_c(s)$ and $R^2(s)$ with $s$ at short genomic distances,
although it does not account for the leveling-off effects
observed in FISH studies (6,9). The FG model also does not
explain the observed variation in $\alpha$ among different
chromosomes. By attaching diffusible binders to chromatin,
the Strings and Binders Switch (SBS) model can account for
both the leveling-off effects and the heterogeneous scaling
of $\alpha$ (12). However, individual scaling properties can
exist only under carefully tuned conditions of binder
concentrations and binding site distributions, which are
unknown a priori. In addition, the SBS model does not
exhibit multiple scaling exponents occurring simultaneously
under one set of conditions.

The most important factor that determines how chromo- 
somes fold in the cell nucleus is the amount of available 
space. In a recent study, spatial constraints were shown to
be sufficient to produce the overall structural architecture of
the budding yeast genome (22), although the general effects of
spatial confinement on chromatin folding in the human genome
are unknown. Polymer models can provide important insights
into chromatin compaction in the cell nucleus (19-21).
However, a major obstacle in studying chromatin fibers con- 
fined in a small volume is the difficulty in generating a large 
number of unbiased model chromatin fibers in the form of 
self-avoiding chains with appropriate physical and spatial 
properties (23-25). 

In this study, we examined the effects of spatial con- 
finement on chromosome folding using the constrained 
self-avoiding chromatin (C-SAC) model. We developed a 
novel algorithm that can generate large ensembles of di- 
verse model chromatin chains in severe spatial confinement, 
with full excluded volume effect incorporated. We find that 
spatial confinement is plausibly responsible for much of 
the observed overall scaling behavior of human chromo- 
some folding. The heterogeneous ensemble of folded model 
chromatin chains under spatial confinement also predicts 
chromosome-specific scaling relationships, as well as forma- 
tion of highly interactive substructures that might give rise 
to the formation of topological domains. Our findings high- 
light the importance of nucleus size in regulating the folding 
landscape of chromosomes. 

MATERIALS AND METHODS 

Model and parameters 

In our C-SAC model, a chromatin fiber is represented as a 
self-avoiding polymer chain consisting of beads. Each bead 
has a diameter of 30 nm (26,27) and is 3000 bp long (28,29). 
Every five beads form a persistence unit, which corresponds 
to a persistence length of 150 nm (Figure 1A) (29). Our 
model chain is 4996 beads long, equivalent to about 15 Mbp
of DNA.

Figure 1. The physical model of C-SAC chains and its scaling
properties. (A) Schematic representation of the C-SAC model.
A chromatin fiber is represented by a self-avoiding polymer
chain with a persistence length $L_p$. Solid spheres are the
beads at the boundaries of a persistence unit. Spheres in
between are the interpolated beads inside a unit of $L_p$.
Polymers are grown as chains inside a spherical confined
space of diameter $D$. Beads are not allowed to cross each
other or grow beyond the boundary of the spherical volume.
(B) The scaling of mean-square spatial distance $R^2(s)$
from 10 000 chains of length $1000 L_p$ in $\log_{10}$ scale.
$R^2(s)$ follows a power law of $\sim s^{2\nu}$, with
$\nu \approx 0.34$ (95% confidence interval: [0.30, 0.38]),
similar to the measured $\nu$ of ~0.33 (6,9). (C) The scaling
of contact probability $P_c(s)$. $P_c(s)$ follows a power law
of $\sim 1/s^{\alpha}$, with $\alpha \approx 1.05$ (95%
confidence interval: [0.95, 1.15]), similar to the measured
$\alpha$ of 1.08 (3). (D) Comparison of exponent $\alpha$ of
contact probability, $P_c(s)$, between C-SAC and Hi-C data
(3). Values of $\alpha$ for different chromosomes from
references (3,12) were compared to those calculated
separately for different clusters in the C-SAC population of
chromatin chains. Black bars denote the experimentally
observed exponent $\alpha$ for Chr 19, Chr 11, Chr X and the
average $\alpha$ across all chromosomes in the human genome.
Grey bars are the values of $\alpha$ from the corresponding
C-SAC clusters and the average $\alpha$ of the entire
population.

Each chromatin chain is generated in the confined space of
the nucleus, which we modeled as a sphere. The sphere
diameter, $D$, is selected to be proportional to the size of
the human cell nucleus. We assumed an average nucleus
diameter of ~11 μm for 6 billion base pairs of human DNA
(1). The diameter of the confining sphere for a 15 Mb long
chromatin chain is therefore about 1.5 μm. With this model,
we grow our chromatin chains sequentially in a sphere of
diameter $D = 1.5$ μm (Figure 1A). We overcame the
difficulties of generating folded chromatin chains inside a
small volume by sequentially growing self-avoiding chains
one persistence unit at a time using the technique of
geometric sequential importance sampling (see Supplementary
Information for more details) (23-25,30-33). Subsequently,
$D$ was changed to 2.5, 5.0, 7.5, 10.0, 30.0 and 500.0 μm to
explore the effects of the size of the confined space on
the spatial organization of chromatin.
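The 1.5 μm confinement diameter quoted for a 15 Mb chain is consistent with assuming that the occupied volume is proportional to DNA content, so that the diameter scales with the cube root of the base-pair ratio. The following is a minimal sketch of that arithmetic; the function name `confinement_diameter` is illustrative and uniform DNA density is the assumption behind it.

```python
def confinement_diameter(nucleus_diameter_um, genome_bp, chain_bp):
    """Diameter of a sphere that holds chain_bp at the same average
    DNA density as a nucleus holding genome_bp (assumed uniform)."""
    return nucleus_diameter_um * (chain_bp / genome_bp) ** (1.0 / 3.0)

# 11 um nucleus, 6 Gbp genome, 15 Mbp model chain (values from the text)
d = confinement_diameter(11.0, 6e9, 15e6)
print(f"D = {d:.2f} um")
```

With the values from the text this gives a diameter just under 1.5 μm.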

Growing chromatin chains using geometric sequential importance sampling

The chromatin chains in 3D space are generated following
a chain growth approach (23-25,30-33). A chromatin chain
contains $n$ persistence units, with the location of the
$i$th persistence unit denoted as
$x_i = (a_i, b_i, c_i) \in \mathbb{R}^3$. The configuration
$x$ of a full chromatin chain with $n$ persistence units is:

$$x = (x_1, \ldots, x_n).$$

Our target distribution $\pi(x)$ is the uniform distribution
of all spatially realizable chromatin chains within the
given confinement. To generate a chromatin chain, we grow
the chain one persistence unit at a time, ensuring the
self-avoiding property along the way, namely,
$x_i \neq x_j$ for all $i \neq j$. We use a $k = 100$-state
off-lattice discrete model (see (24-25,30-33) for more
details). The new persistence unit added to a growing chain
with the current persistence unit located at $x_i$ is placed
at $x_{i+1}$, which is a persistence length $L_p$ away from
$x_i$. $x_{i+1}$ is randomly taken from one of the
unoccupied $k$ sites neighboring $x_i$. As random selection
from the available empty neighboring sites introduces bias
in sampling from $\pi(x)$, we keep track of the bias and
assign each successfully generated chain a proper weight
$w(x)$. Details can be found in references (24,30-31).
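The growth-and-weighting idea described above can be illustrated with a simplified Rosenbluth-style sketch. This is not the geometric sequential importance sampling algorithm of the cited references: it omits resampling and the interpolated beads, checks self-avoidance only between persistence-unit positions, and uses a Fibonacci-sphere set of directions as a stand-in for the 100-state model. All function names are illustrative, and lengths are in nm.

```python
import numpy as np

def fibonacci_directions(k):
    """k roughly evenly spread unit vectors (assumed stand-in for the k states)."""
    i = np.arange(k) + 0.5
    z = 1.0 - 2.0 * i / k
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i
    r = np.sqrt(1.0 - z * z)
    return np.column_stack((r * np.cos(phi), r * np.sin(phi), z))

def grow_chain(n_units, D, Lp=150.0, d_fiber=30.0, k=100, rng=None):
    """Grow one chain inside a sphere of diameter D centred at the origin.

    Returns (coords, log_weight), or None if the growth reaches a dead end.
    """
    rng = np.random.default_rng() if rng is None else rng
    dirs = fibonacci_directions(k)
    chain = np.zeros((n_units, 3))      # first unit placed at the centre
    log_weight = 0.0
    for t in range(1, n_units):
        cand = chain[t - 1] + Lp * dirs                 # k candidate sites
        inside = np.linalg.norm(cand, axis=1) <= D / 2.0
        # distance from every candidate to every unit placed so far
        dist = np.linalg.norm(cand[:, None, :] - chain[None, :t, :], axis=2)
        free = inside & (dist.min(axis=1) >= d_fiber)
        m = int(free.sum())
        if m == 0:
            return None                                  # dead end
        chain[t] = cand[rng.choice(np.flatnonzero(free))]
        # uniform choice among m sites has probability 1/m, so the
        # correction toward the uniform target multiplies the weight by m
        log_weight += np.log(m)
    return chain, log_weight

result = grow_chain(50, D=1500.0, rng=np.random.default_rng(0))
```

Ensemble averages would then be taken with weights proportional to `exp(log_weight)`, normalized over the successfully grown chains.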

Each persistence unit further contains $[(L_p/d_f) - 1]$
monomer beads, where $L_p = 150$ nm, and the fiber
diameter $d_f$ is 30 nm. These monomers are connected in
a chain, and their positions are interpolated as if they
were on a rigid rod (Figure 1A). This is to mimic the
persistence behavior of the chromatin fiber. We again
enforce the self-avoiding property, such that these beads
will not intersect with any other beads in the partial chain
that has already been grown. Altogether, there are
$N + (N - 1)[(L_p/d_f) - 1] = 1000 + 999\,[(150/30) - 1] = 4996$
monomer beads for an $N = 1000 L_p$ unit long chain. For
larger confinement space, we generated chains up to
$N = 8100 L_p$.
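The bead count can be checked with a short calculation. The sketch below uses the parameters stated in the text (150 nm persistence length, 30 nm beads, 3000 bp per bead); the function name `total_beads` is illustrative.

```python
def total_beads(n_units, Lp_nm=150, d_fiber_nm=30):
    """Boundary beads plus interpolated beads for a chain of n_units."""
    interior = Lp_nm // d_fiber_nm - 1      # interpolated beads per segment
    return n_units + (n_units - 1) * interior

beads = total_beads(1000)
mbp = beads * 3000 / 1e6                    # 3000 bp per bead
print(beads, mbp)
```

For 1000 persistence units this gives 4996 beads, or just under 15 Mbp.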

Method validation 

Scaling of C-SAC chains without confinement. We first
used our geometric sequential importance sampling
technique to generate free-space self-avoiding C-SAC chains
without confinement, as their scaling behavior is well
understood (34). We generated 10 000 C-SAC chains of each
length $N$, for $N \in \{100, 200, \ldots, 1000\}$. The
scaling relationships $R(N) \sim N^{\nu}$ and
$P_c \sim N^{\alpha}$ are shown in Supplementary Figure 1.
The scaling exponents are found to be $\nu \approx 0.59$ and
$\alpha \approx -1.88$, which are very close to the expected
values of $\nu \approx 3/5$ and $\alpha \approx -3\nu$ (34).
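A scaling exponent of this kind is commonly estimated as the slope of a straight-line fit in log-log space. The sketch below shows one such fit on synthetic data with a known exponent; the data are not from this study, and whether the original analysis used an ordinary least-squares fit is an assumption.

```python
import numpy as np

def fit_exponent(x, y):
    """Least-squares slope and intercept of log10(y) versus log10(x)."""
    slope, intercept = np.polyfit(np.log10(x), np.log10(y), 1)
    return slope, intercept

# Synthetic check: R(N) = 2 * N**0.59 should give back nu = 0.59
N = np.arange(100, 1001, 100)
R = 2.0 * N ** 0.59
nu, _ = fit_exponent(N, R)
print(f"nu = {nu:.2f}")
```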

RESULTS 

C-SAC model gives observed scaling behavior of human chromosomes

We generated ensembles of 10 000 independent self-avoiding
model chromatin chains for different chain lengths $N$ of
50, 100, 200 and then up to 1000 with increments of 100,
confined to a region of $D = 1.5$ μm. Our C-SAC model
chromatin chains exhibit experimentally observed scaling
properties. The mean-square spatial distance $R^2(s)$ of
partial chains of length $s$ from 10 000 chains of length
$N = 1000 L_p$ follows the relationship
$R^2(s) \sim s^{2\nu}$, with an exponent $\nu$ of ~0.34 at
shorter genomic distances (the 95% confidence interval of
[0.30, 0.38] is obtained by bootstrapping the 10 000 chains
10 000 times), but levels off with $\nu = 0$ at larger
genomic distances. The experimentally observed $\nu = 0.33$
was derived from FISH data between 0.4 Mb and 2.0 Mb (6).
In our C-SAC model, $\nu = 0.34$ is derived accordingly
between 5 and $25 L_p$, by matching the onset points of
the leveling-off effect (10 Mb in the FISH study, and
$125 L_p$ in C-SAC chains) (6) (Figure 1B, see Supplementary
Information for additional details and results). Since the
mass density of chromatin and how it varies among different
loci and different chromosomes are unknown, the regime from
which the exponents are extracted is not directly comparable
by genomic distance to the experimental data. The mass
density used in this study is an average property, and it
may differ from the actual mass density at the loci measured
in the FISH experiments (6). As $125 L_p$ is where C-SAC
chains level off and 10 Mb is the genomic length where
the leveling-off is observed in FISH experiments (6), we
matched $125 L_p$ to 10 Mb. Under this matching one $L_p$
corresponds to 80 kb, so the range of 5 to $25 L_p$ maps onto
the 0.4-2.0 Mb window. This allowed us to calculate the
exponent $\nu$ in the same regime as the FISH experiments (6).
To characterize the scaling relationship between contact
probability $P_c(s)$ and contour length $s$ between two loci,
we harvested partial chains of length $s$ from independent
ensembles of different full chain lengths and estimated
$P_c(s)$. As the contact probabilities $P_c(s)$ between loci
at genomic distance $s$ were derived from fragments from
different chromosomes in Hi-C studies (3), partial chains
from independent ensembles of varying full lengths are
necessary to remove self-correlations, which may occur when
subchains are taken from the same ensemble of chains with a
fixed full length as in (3). Our C-SAC model can reproduce
the scaling relationship of contact probability
$P_c(s) \sim 1/s^{\alpha}$, with an exponent $\alpha$ of
~1.05 (Figure 1C), which is in excellent agreement with
$\alpha \approx 1.08$ measured in Hi-C studies (3).
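The bootstrap confidence interval quoted for $\nu$ can be illustrated with a percentile bootstrap over chains. The sketch below is only one plausible reading of the procedure: it resamples chains with equal probability and ignores importance weights for brevity, and the array layout `sq_dist[c, j]` is an assumption made for the example.

```python
import numpy as np

def bootstrap_nu(sq_dist, s, n_boot=1000, rng=None):
    """95% percentile bootstrap interval for nu in R^2(s) ~ s**(2*nu).

    sq_dist[c, j] is the squared spatial distance at separation s[j]
    measured in chain c (hypothetical layout).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_chains = sq_dist.shape[0]
    log_s = np.log10(s)
    nus = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_chains, size=n_chains)   # resample chains
        mean_r2 = sq_dist[idx].mean(axis=0)
        slope = np.polyfit(log_s, np.log10(mean_r2), 1)[0]
        nus[b] = 0.5 * slope                             # slope equals 2*nu
    return np.percentile(nus, [2.5, 97.5])
```

A call such as `bootstrap_nu(sq_dist, np.arange(5, 26))` would restrict the fit to the 5 to $25 L_p$ window used in the text.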

Our C-SAC model also captures observed deviations in
$\alpha$ from the average in individual chromosomes (12).
After clustering the 10 000 C-SAC chains of length
$N = 1000 L_p$ according to their spatial similarities
measured in pairwise bead distances (see Supplementary
Information), the resulting 20 clusters have exponents
$\alpha$ ranging from 0.79 to 1.3 (Supplementary Table 1).
The exponents of these clusters give the full range of
$\alpha$ observed experimentally. For example, exponents of
clusters 10, 15 and 17 agree well with those of Chromosome
19, X and 11/12, respectively (Figure 1D) (3,12). These
results are obtained without using any characteristics
specific to Chromosome X, 19, 11 or 12. The complex scaling
properties of the human genome arise fully from structural
clusters resulting from the spatial confinement. Overall,
our results indicate that the restriction of volume imposes
strong constraints, and chromatin chains under such
confinement exhibit experimentally observed scaling behavior
of human chromosomes.

Nuclear size determines chromosomal scaling behavior 

To examine the effects of the spatial confinement, we
generated independent ensembles of 10 000 C-SAC chains of
length $N$ inside a sphere of diameter $D$. Here $N$ is
varied from 50, 100 and then up to 1000, with increments of
100. The sphere diameter $D$ takes the values of 2.5, 5.0
and 7.5 μm, in addition to 1.5 μm. We independently
generated different ensembles of 10 000 C-SAC chains at each
combination of $N$ and $D$ values. Altogether, we have
4 × 11 = 44 independent ensembles of 10 000 C-SAC chains for
calculating contact probability. We used partial chains of
length $s$ from the ensemble of 10 000 chains of $N = 1000$
at each $D$ for the calculation of mean-square spatial
distance $R^2(s)$, following the approach used in the FISH
studies (6,9). We found that both exponents $\alpha$ and
$\nu$ increase with $D$ (Figure 2A and B). Furthermore,
chromatin chains tend to adopt more open conformations as
$D$ increases. At the same time, the leveling-off effect at
larger genomic distances disappears (Figure 2B). Further
clustering of chromatin structures (see Supplementary
Information) at different nuclear sizes showed that even
with the smallest nucleus size of $D = 1.5$ μm, there
exists a substantial fraction of open chromatin structures
(10.9%), while the compact structures and in-between
structures are 18.6 and 70.5% of the population,
respectively. As the size of the nucleus increases, the
percentage of open-like structures in the population
increases. These results therefore suggest that nuclear size
is a major factor in influencing the overall folding
landscape of chromatin, via modulation of the spatial
confinement scale $D$.

Figure 2. Size of confinement affects the scaling behavior
of human chromosomes. (A) Mean-square spatial distance
$R^2(s)$ versus genomic distance $s$ for different
confinement sizes $D$. For larger nuclei, chromatin chains
have a larger exponent $\nu$. The leveling-off effects
disappear as the confinement size increases. (B) Contact
probability $P_c(s)$ versus chain length $s$ for different
confinement sizes $D$. For larger nuclei, chromatin chains
have a larger exponent $\alpha$, indicating more open
conformations.

Formation of highly interactive substructures upon confinement and topological domains

We used the C-SAC model to further explore structural
properties of chromatin fibers. Topological domains were
previously observed in electron microscopy studies
(11,35-36) and in recent 3C-based studies (Hi-C) (37,38).
Such domains are distinctive regions along the chromatin
chain with significantly elevated interactions within the
region (37). Their DNA content ranges from a few kbp to
1 Mbp, and they occupy a volume of 300-800 nm in diameter
(39). To examine whether C-SAC chains contain domain-like
substructures, we calculated the number of consecutive
persistence units in spheres of 800 nm diameter along the
chromatin chain. If a sphere contains chromatin fragments
that contain more than 400 kbp of DNA, it is regarded as a
highly interactive substructure. We further define two types
of substructures: (i) interactive substructures in which
more than 20% of their persistence units are in spatial
proximity with those of other interactive substructures, and
(ii) independent substructures in which none of their units
are in spatial proximity with any units of other
substructures (Figure 3A).
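One way to make the substructure criterion concrete is a greedy scan along the chain. The sketch below is an illustration under stated assumptions, not the procedure used in this study: the sphere is centred on the centroid of the current run, runs are non-overlapping, and each persistence unit carries 15 kbp (five 3000-bp beads, as in the model description). The function name `find_substructures` is hypothetical.

```python
import numpy as np

BP_PER_UNIT = 5 * 3000   # five 3000-bp beads per persistence unit

def find_substructures(coords, diameter=800.0, min_bp=400_000):
    """Return (start, end) index pairs (end exclusive) of runs of
    consecutive units that fit in a sphere of the given diameter (nm)
    and contain more than min_bp of DNA."""
    n = len(coords)
    runs, i = [], 0
    while i < n:
        j = i + 1
        # extend the run while it still fits in a sphere around its centroid
        while j < n:
            seg = coords[i:j + 1]
            radius = np.linalg.norm(seg - seg.mean(axis=0), axis=1).max()
            if radius > diameter / 2.0:
                break
            j += 1
        if (j - i) * BP_PER_UNIT > min_bp:
            runs.append((i, j))
            i = j            # continue after the detected substructure
        else:
            i += 1
    return runs
```

With these parameters a run must contain at least 27 consecutive persistence units to exceed 400 kbp.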

On average, there are ~6.5 substructures per chain, which
occupy around 21% of the entire 15 Mbp C-SAC chain.
41% of these substructures are interactive, whereas the rest
of them are independent substructures (Figure 3B). The
existence of these highly interactive substructures is also
observed from interaction matrices and 3D conformations of
individual chains (Figure 3C and D).

In summary, there exist distinct substructures in C-SAC
chromatin chains with elevated interactions. These results
are observed without requiring special simulation
conditions or specific binding sites as in other chromatin
models (3,12,16). Their existence suggests that the
confinement of the cell nucleus is sufficient to induce
tentative formation of highly interactive substructures
along the chromatin chains, which could further give rise to
the formation of topological domains.

Figure 3. Formation of domain-like substructures upon
confinement. (A) Illustration of substructures on C-SAC
chains, where consecutive monomers are contained in spheres
of 800 nm diameter (grey circles). Two independent
substructures with no interaction between them, as well as
two interactive substructures with >20% of their monomers
participating in the interaction (interface shaded), are
shown. (B) The distribution of the number of substructures
per chain containing different amounts of DNA, for both
independent and interactive categories. (C) A random C-SAC
chromatin chain with independent substructures. The rotated
and zoomed-in substructure shows a singular domain-like
conformation. The domain-like substructures can also be seen
in the corresponding distance matrices, where the spatial
distances between different loci of the C-SAC model chains
are color coded, with darker shade representing interactions
between chromatin beads. The chromatin chain contains two
highly interactive substructures that do not interact with
each other. (D) A random C-SAC chromatin chain with
interactive substructures. There are two small interactive
substructures, as can be seen in the rotated and zoomed-in
conformation. The domain-like substructures can also be seen
in the corresponding distance matrices, where the spatial
distances between different loci of the C-SAC model chains
are color coded, with darker shade representing interactions
between chromatin beads. The chromatin chain contains two
highly interactive substructures that interact with each
other. Circles highlight regions of interactions.



Scaling behavior of human chromosomes is not altered by random binder-mediated looping interactions

Figure 4. Scaling exponents $\alpha$ and $\nu$ when
different fractions of C-SAC chromatin chains are covered by
binding sites. The binding energy is assigned to be
$6 k_B T$ (12) for two relative temperatures ($T$ in grey,
$10T$ in black). Neither scaling exponent changes
significantly as the binder coverage increases.

Several polymer models of long-range chromatin organization
are based on the introduction of explicit looping
probability or looping through binder-mediated interactions
(12,17). To assess how chromatin looping in addition to
confinement would affect the scaling behavior of chromatin
chains, we distributed different numbers of binding sites
randomly along the chromatin chains, covering from 10 to
50% of the total number of persistence units in the
chromatin fiber. Chromatin structures with a large number of
binding sites in spatial proximity are subject to
binder-mediated interactions. These structures will then
have lower energy and therefore higher probability of
presence in the chromatin population. We calculated the
distribution of the chromatin chains with such binder
interactions, in which the binding energy of connecting two
interacting sites is assigned to be $6 k_B T$ (12,40) (see
Supplementary Information). This allows us to assess the
scaling properties of different populations of chromatin
chains under different looping conditions.
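The statement that chains with more binding sites in proximity have lower energy and hence higher probability can be illustrated by Boltzmann reweighting of an existing ensemble. The sketch below is a hypothetical reading of that idea, not the calculation described in the Supplementary Information: the contact cutoff is a free parameter, every pair of binding sites within the cutoff contributes one bond, and energies are expressed in units of $k_B T$ at the reference temperature.

```python
import numpy as np

def binder_log_boltzmann(coords, sites, cutoff, eps=6.0, t_rel=1.0):
    """Log Boltzmann factor of one chain.

    coords : (n, 3) persistence-unit positions
    sites  : indices of units carrying a binding site
    cutoff : distance below which two sites count as bound (assumed)
    eps    : binding energy per bond in k_B*T (6 in the text)
    t_rel  : temperature relative to the reference (1 or 10 in Figure 4)
    """
    p = coords[np.asarray(sites)]
    d = np.linalg.norm(p[:, None, :] - p[None, :, :], axis=2)
    iu = np.triu_indices(len(p), k=1)          # each pair counted once
    n_bonds = int((d[iu] <= cutoff).sum())
    return eps * n_bonds / t_rel               # equals -E / (k_B * T_rel)

def reweight(log_w, log_boltz):
    """Normalized weights combining sampling weights and Boltzmann factors."""
    lw = np.asarray(log_w, dtype=float) + np.asarray(log_boltz, dtype=float)
    lw -= lw.max()                             # guard against overflow
    w = np.exp(lw)
    return w / w.sum()
```

Scaling exponents for the binder-decorated population would then be computed as weighted averages using the output of `reweight`.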

Our results showed that there is virtually no change in the
scaling exponents $\alpha$ and $\nu$ in C-SAC chains after introducing
binders compared to the original C-SAC chains, where
the only constraint is the spatial confinement of the cell nu- 
cleus (Figure 4). These results indicate that random self- 
avoiding chromatin chains folded inside a confined space 
have an intrinsic propensity to form loops, without the ex- 
plicit introduction of additional binders. Overall, our results 
indicate that the confinement at the scale D is the dominant 
factor in determining the average scaling behavior of chro- 
matin structures. 

Relevant size regime of chromatin confinement for the spatial 
organization of chromosomes 

Chromosomes are found to occupy localized territories of
~2 μm in diameter (11). To span a chromosome territory,
only a short fragment of chromatin fiber (ca. 150 kb,
assuming 30 nm diameter and 150 nm persistence length) is
required, which is well below the range of 0.5-7.0 Mb
measured in Hi-C studies (3). Studying the scaling behavior
of self-avoiding polymer chains in the correct confinement
is therefore the key to constructing a relevant model for
understanding chromosome folding. We used the C-SAC model
to explore the relationship between the size of confinement
and the scaling properties of confined chromatin chains, as
the calculated scaling exponent of C-SAC chains in the
relevant size regime ($\alpha \approx 1.05$) is well below
the theoretical scaling exponent of self-avoiding polymer
chains in confinement ($\alpha \approx 1.50$) (34).

We generated independent ensembles of C-SAC chains
in equilibrium and calculated the relationship between the
chain length and the end-to-end distance, as well as the
relationship between contact distance and the contact
probability. In total, we generated chains of different
lengths $N$ of 50, 100, 200 and up to 1000 at increments of
100, each with five different confinements $D$ ($D$ = 1.5,
2.5, 10, 30 and 500 μm). We also generated C-SAC chains of
length $N$ from 50 to 8100 at different increments for two
different confinements $D$ ($D$ = 5.0 and 7.5 μm) (Figure
5). We found that chromatin chains exhibit
confinement-dependent scaling behavior, with $\nu$ ranging
from 0.30 to 0.60 (Figure 5A). That is, the mean end-to-end
distance of self-avoiding C-SAC chains in a spherical
confinement is a function of both the chain length $N$ and
the confinement diameter $D$, when the chain length $N$ is
larger than $D$ (see Supplementary Figure S2). This
confinement-dependent regime is illustrated in Figure 5 for
both mean end-to-end distance and contact probability.
Figure 5B includes data presented in Figure 2B, with
additional data for $D$ of 10, 30 and 100 μm to depict
comprehensively the relationship between $\alpha$ and the
genomic distance. This helps to illustrate the important
issue of the cross-over regime for self-avoiding chromatin
and the convergence of the scaling exponent $\alpha$. The
asymptotic relationship of $\alpha = 3\nu$ (34) is well
satisfied at larger $D$ values, but less so at smaller $D$,
as the leveling-off effects take place at shorter chain
lengths with more severe confinement at smaller $D$.

Figure 5. Scaling properties of self-avoiding C-SAC chains
in confinement. (A) Relationship between the mean end-to-end
distance and the chain length. Each data point is an average
of 10 000 chains of different length under specific
confinement of diameter $D$. As $D$ increases, the scaling
behavior of self-avoiding walks converges to that of ideal
SAW ($\nu = 0.6$). (B) Relationship between mean contact
probability and partial chain length $s$. Each data point is
an average of 10 000 chains of different length $N$ under
different confinement of diameter $D$.

Severe spatial confinement has pronounced effects on 
the conformations of self-avoiding polymers. Overall, we 
find that the effective scaling exponent slowly changes with 
increasing D, reflecting a rather slow convergence to the 
asymptotic behavior expected from simple polymer scaling 
theory (34). 



DISCUSSION 

Chromosomes reside within the severely confined space of 
the cell nucleus. However, the direct effects of nuclear con- 
finement on chromatin folding and compaction are un- 
known. A major challenge is the extreme difficulty in ad- 
equate sampling of long self-avoiding chromatin chains in 
the confinement of the cell nucleus. The C-SAC model and 
the novel sampling technique developed in this study en- 
abled us to generate a large number of chromatin confor- 
mations in confinement. 

Our results showed that the spatial confinement of ~15
Mb chromatin within regions of diameter $D$ of 1.5 μm gives
rise to the chromosomal scaling relationships of the average
$\alpha \approx 1.05$ (3) and the average $\nu \approx 0.34$,
as well as the leveling-off effects observed experimentally
(6,9). Our model also captured the complex folding behavior
of the chromosome-specific variation in scaling (12). In
addition, the tentative formation of domains (11,35-38) also
emerged in the C-SAC model as highly interactive
substructures, without the need of introducing additional
binder molecules and fine tuning of their concentrations.
These interactive substructures could be stabilized by
introducing more specific interactions through evolutionary
selection pressure to form functional topological domains.

We found that D, and therefore nuclear size, is a major 
factor in influencing the overall folding landscape of chro- 
matin. As nuclear size changes, there are significant differ- 
ences in the chromosome architecture, which are reflected 
in variations in the scaling exponents. These conclusions 
are in good agreement with results from Hi-C studies using 
different cell lines (3,12,18,37). For example, lines of differ- 
entiated cells (GM06990 (3), GM12878 (18), IMR90 (37)) 
have similar overall average scaling behavior, with $\alpha \approx 1.08$
(3,12), while embryonic stem cells (hESC) (37) behave differently,
with an $\alpha$ close to 1.6 (12). A characteristic property
of an hESC nucleus is that it occupies almost the entire cell
volume (41,42) and is plastic and deformable (43). This provides
an enlarged space for chromosome organization. As a
result, hESC chromatin is largely diffuse (41). Our calculation
also showed an increase in $\alpha$ when the confined space
is enlarged (Figure 2B). This observed variation in scaling
corresponds well with the confinement of different nuclear
sizes.

The average compactness of the chromatin chains and 
the fractions of open, compact and in-between chromatin 
structures are all different when the nuclear size is changed. 
Nuclear size likely alters the overall structural organization 
of chromosomes, allowing previously unlikely long-range 
interactions to occur, at the same time prohibiting certain 
other genomic interactions present at a different nucleus
size. Thus, nuclear confinement may bias distant sites towards
spatial proximity.

Our results showed that randomly placed binders do not
directly affect the scaling behavior. Biological binders such
as CTCF may play more specific roles of modifying or biasing
chromosomes toward formation of specific domains
required for cell function (44-46). Future work on the
selection of properly placed CTCF binding sites and its
effects will likely be fruitful for understanding the effects
of biochemical binding on the spatial organization of
chromosomes.

We compared predictions from C-SAC models with those
from other chromatin models. As the experimentally observed
$\nu \approx 0.3$ deviates significantly from the expected
$\nu$ of 0.5 for sub-chains in an equilibrium globule,
chromatin fibers were conjectured to be in a non-equilibrium
fractal globule (FG) state, in which the exponent of
$\nu \approx 0.3$ would be retained at every scale (3,16).
The lack of leveling-off effects in $P_c(s)$ with $s$
observed in (3) is consistent with the prediction of the FG
model. However, leveling-off effects are observed in FISH
studies on different chromosomes at several different length
scales (6,9). These leveling-off effects are not accounted
for by the FG model. In addition, the significant variation
of exponent $\alpha$ among different chromosomes of human
cells (3,12) is not explained by the FG model.

An important consideration in studying the scaling
relationship of chromosomes is the relevant size regime
dictated by experimental observations. An average chromosome
of 50-100 Mbp occupies a territory of size ~2 μm (11). As a
result, a chromatin fiber must traverse back and forth many
times in the chromosome territory, so severe spatial
confinement is at play and will have pronounced effects on
the folding and scaling of chromatin fibers. General
asymptotic scaling analysis of polymers is overly simplistic
to offer much insight under such strong finite-size effects.
Conventional simulation studies based on Metropolis Monte
Carlo are also challenged to generate adequate samples to
study the equilibrium ensemble of severely confined
chromatin chains in the relevant regime.

Simulation using the novel technique of geometric sequential
importance sampling allows the effects of the finite size
of confinement to be examined in detail. Our results offer an
alternative explanation for the scaling relationship of
chromosomes to the existing FG (3,16) and SBS (12) models.
Overall, our results show that the equilibrium ensemble of
C-SAC chromatin chains under severe confinement of scale
D = 1.5 μm exhibits scaling behavior consistent with known
experimental data, which differs from that of asymptotic
random chains in the relevant biological scale.
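
The general idea of sequential importance sampling under
confinement can be illustrated with a deliberately simplified toy
example. The sketch below is not the geometric sequential
importance sampling used for C-SAC. It is a classic Rosenbluth-style
growth of a self-avoiding walk on a cubic lattice, restricted to a
sphere and anchored at its center, with an importance weight that
corrects for the biased growth. All names and parameter values are
hypothetical.

```python
import random

# The six unit steps of a simple cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def grow_confined_saw(n_nodes, radius, rng):
    """Grow one self-avoiding lattice chain inside a sphere centered at the origin.

    Returns (chain, weight). The weight is the product of the number of
    admissible moves at each step (Rosenbluth weight); it is 0.0 if the
    chain became trapped before reaching n_nodes.
    """
    chain = [(0, 0, 0)]
    occupied = {(0, 0, 0)}
    weight = 1.0
    r2 = radius * radius
    while len(chain) < n_nodes:
        x, y, z = chain[-1]
        free = []
        for dx, dy, dz in STEPS:
            p = (x + dx, y + dy, z + dz)
            inside = p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= r2
            if inside and p not in occupied:
                free.append(p)
        if not free:
            return chain, 0.0          # dead end: sample carries no weight
        weight *= len(free)            # correct for the non-uniform growth
        nxt = rng.choice(free)
        chain.append(nxt)
        occupied.add(nxt)
    return chain, weight

def weighted_mean_square_end_to_end(n_nodes, radius, n_samples, seed=0):
    """Importance-weighted estimate of the mean-square end-to-end distance."""
    rng = random.Random(seed)
    num = 0.0
    den = 0.0
    for _ in range(n_samples):
        chain, w = grow_confined_saw(n_nodes, radius, rng)
        if w == 0.0:
            continue
        d2 = sum((a - b) ** 2 for a, b in zip(chain[0], chain[-1]))
        num += w * d2
        den += w
    if den == 0.0:
        raise ValueError("all sampled chains were trapped")
    return num / den

estimate = weighted_mean_square_end_to_end(n_nodes=20, radius=3.0, n_samples=200)
print(f"weighted <R^2> for a 20-node confined chain: {estimate:.2f}")
```

Because every chain is grown node by node and never rejected for
overlap, samples remain available even under tight confinement. The
weights restore the uniform distribution over admissible chains, at
the cost of increasing weight variance for long chains.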

A useful result that can be inferred from our analysis is
that chromosomes are restricted via confinement of
sub-chromosomal regions of about 15 Mb, each within a region
of diameter D of about 1.5 μm. Therefore, it may be useful to
consider the nucleus to be made up of close-packed regions of
size D, each containing ~15 Mb of DNA. For example, one can
consider whole chromosomes to be made up of individual units
of 15 Mb of DNA, confined to spherical regions of diameter
~1.5 μm. The whole human nucleus containing ~6 Gbp of DNA can
be considered to be a collection of ~6000/15 = 400 such units,
which can be fit into a nucleus of diameter
~400^(1/3) × D ≈ 7.4D ≈ 11 μm, compatible with the observed
size of human cell nuclei (1) (see Supplementary Information).
Therefore, the subchromosome confinement parameter D, namely,
the size of a region containing 15 Mb of DNA, is an important
parameter in our structural model.
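
The close-packing estimate above can be checked with a few lines
of arithmetic. The sketch below uses only the values quoted in the
text (~6 Gbp per nucleus, 15 Mb per unit, D = 1.5 μm) and assumes
that the nuclear diameter scales as the cube root of the number of
units, ignoring any packing-fraction correction. The function name
is hypothetical.

```python
GENOME_MB = 6000.0   # ~6 Gbp of DNA per human nucleus (value from the text)
UNIT_MB = 15.0       # DNA content of one confined unit (value from the text)
D_UM = 1.5           # diameter of one confined unit in micrometers (from the text)

def nucleus_diameter_um(genome_mb, unit_mb, d_um):
    """Return (number of units, nucleus diameter) for close-packed units.

    Assumes volume grows linearly with the number of units, so the
    linear size grows as n_units ** (1/3); packing gaps are ignored.
    """
    n_units = genome_mb / unit_mb
    return n_units, n_units ** (1.0 / 3.0) * d_um

n_units, diameter = nucleus_diameter_um(GENOME_MB, UNIT_MB, D_UM)
print(f"{n_units:.0f} units, nucleus diameter ~ {diameter:.1f} um")
```

With these inputs the cube-root factor is about 7.4, giving a
diameter of about 11 μm, inside the 6-20 μm range quoted in the
Introduction.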

As spatial confinement is a dominant factor in determining
chromosome folding, the specific epigenetic states of
genes and transcription activities in different cell types are
likely to be influenced by the degree of nuclear confinement.
Cell nucleus size at different developmental stages or
physiological states may be altered to induce a different
chromosome folding landscape, enabling different genetic
programs to be activated. Overall, how nuclear size and shape
relate to cell size and shape, and how their relative ratio or
pattern regulates the epigenetic programs of cells at
different developmental stages, are important problems
requiring further investigation.

Although our approach can generate a large ensemble
of chromatin chains under spatial confinement, there is still
uncertainty in the physical parameters used in the
current C-SAC model, including the persistence length, the
chromatin fiber diameter and the mass density (47). In
addition, current chromatin models are based on growing a
single chromosome chain and cannot be used to study
inter-chromosomal interactions. Another question is how the
15 Mb sequence scale and the parameter D are controlled in
the cell. These issues will likely be resolved when
chromosomal properties are better understood and the C-SAC
algorithm is further improved. It is interesting that our
simplistic model can capture complex folding characteristics
of the human genome. The current study highlights the
importance of spatial confinement in dictating the chromatin
folding landscape. With the accumulation of high-resolution
chromosome conformation capture data, it is envisioned that
more specific spatial information inferred from 3C-based
studies can be incorporated into the C-SAC model, and that
realistic ensembles of chromatin conformations reflecting
3C-based information can be reconstructed to gain insight into
the structural basis of gene regulation and expression.